DETAILED ACTION 
Response to Amendment 

1. In response to the office action from 5/22/2009, the applicant has submitted an 
amendment, filed 8/24/2009, amending independent claims 34-35, 38, 41, and 42, while arguing 
to traverse the art rejection based on the limitation regarding matching a text string from a 
document received by a voice browser to one of a plurality of occurrences of at least one text 
string by searching only within a prompt class (Amendment, Pages 8-9). Applicant's arguments 
have been fully considered; however, the previous rejection is maintained for the reasons 
listed below in the Response to Arguments. 

2. Amended claim 34 now includes a hardware-based processor, which eliminates the 
possibility of a software-only embodiment (Amendment, Pages 6-7). As such, the previous 
corresponding 35 U.S.C. 101 rejection has been withdrawn. The examiner still maintains 
that rendering audio is a post-solution activity step that is not a transformation of 
matter, as is argued by the applicant (Amendment, Pages 6-7), but a transformation of data from 
one form to another. Nevertheless, amended claim 35 does pass one prong of the 
"machine-or-transformation test" in that it ties the recited process to another statutory class 
(i.e., apparatus). Thus, the previous corresponding 35 U.S.C. 101 rejection has been withdrawn. 



Response to Arguments 
3. Applicant's arguments have been fully considered but they are not persuasive for the 
following reasons: 

4. With respect to Claim 34, the applicant argues that Ladd et al (U.S. Patent: 6,269,336) 
and Malsheen et al (U.S. Patent: 5,634,084) fail to teach matching "a text string from the 
document received by the voice browser to one of the plurality of occurrences of the at least one 
text string by searching only within the prompt class" (Amendment, Page 8). The applicant 
supports this position by alleging that the prior art is deficient in this limitation because Ladd et 
al (U.S. Patent: 6,269,336) only teaches a system that delivers prompts in a different sequence 
depending on user input and text strings with a single audio representation. Thus, the applicant 
argues that Ladd fails to teach any type of determining of a prompt class and the aforementioned 
matching limitation (Amendment, Page 8). The applicant also argues that the teachings of 
Malsheen et al (U.S. Patent: 5,634,084) fail to teach the matching step because Malsheen only 
describes a translation table and fails to describe determining a prompt class for a voice browser 
(Amendment, Pages 8-9). 

In response, the examiner notes that it is the combination of the teachings of Ladd and 
Malsheen that teaches the aforementioned claim limitation. Specifically, in Ladd, a user 
utterance is recognized at a voice application and responded to with an appropriate audio prompt 
(Col. 10, Lines 13-21). The voice application in Ladd references a VoiceHTML document that 
specifies text elements to be synthesized into speech (Col. 16, Lines 41-57; Col. 18, Lines 33-39; 
Col. 29, Lines 36-57). Ladd's voice application is also capable of handling a number of different 
navigational contexts pertaining to placing orders, obtaining phone numbers, providing 
directions, etc. (Col. 2, Lines 48-58). These applications deal with terms that 
have multiple pronunciation instances (Col. 18, Lines 56-65). So, while Ladd does teach the 
maintenance of a browser state/history context for prompt issuance in that specific context, Ladd 
is deficient in searching for an audio prompt corresponding to the text in that specific context or 
"prompt class". 

Malsheen, however, recites a system for providing a speech output to a user via 
text-to-speech conversion (Abstract). Malsheen's system further features a translation table (Fig. 1, 
Element 146) that contains what is analogous to the applicant's claimed "prompt class" in the 
form of a type classification (Col. 7, Lines 4-16; Abstract; and "Qual" in Table 1). Text to be 
output as speech is compared with the table or "searched" in the specific type classification in 
order to determine the proper speech output ("membership in ...predefined classes of words", 
Col. 5, Lines 30-50; the class of a word is determined and then the particular expander search 
based on that class is performed, Col. 7, Lines 20-35). Malsheen also provides various examples 
of expansion/pronunciation procedures (Col. 9, Lines 30-60; Col. 10, Lines 25-62; and Table 1). 
Thus, Malsheen performs a contextual search within a specific category or "prompt class". 
When taken in combination with Ladd, Malsheen's speech synthesis expansion means would 
allow text forms of a prompt to be properly spoken (Malsheen, Col. 2, Lines 53-60) and 
conveniently/efficiently located in a look up table based on type. This would be particularly 
useful to Ladd because Ladd's voice application features many applications that may have 
similar text items with different pronunciations (for example, numbers could be associated with 
temperatures for a weather application, ordering products, or stock information; abbreviations 
could be associated with stocks or directions, Ladd, Col. 2, Lines 48-58 and Col. 10, Lines 58-67). 

Application/Control Number: 09/933,956 
Art Unit: 2626 

In regards to the applicant's first argument then, Ladd teaches determining a context of a 
voice browser, while Malsheen relates to determining a particular speech output pertaining to a 
specific context class. Since Ladd teaches determining a particular context pertaining to text 
strings that would have different pronunciations when rendered into speech and Malsheen relates 
the contexts to different prompt types for generating speech, it is the combination of the prior art 
of record that teaches the aforementioned claim limitation. In regards to the applicant's second 
argument, the examiner notes that, as was pointed out above, Malsheen searches for and matches 
an expanded speech output with a text input based on a type classification. Malsheen is not 
relied upon for the teaching of a voice browser, which is taught by Ladd. Again, the examiner 
notes that it is the combination of Ladd's voice browser context identification based upon user 
input and Malsheen's speech output searching procedure within a specific field that teaches the 
aforementioned claim limitation. Thus, the applicant's arguments have been fully considered, 
but are not convincing. 

The art rejections of the remaining independent and dependent claims are traversed for 
reasons similar to claim 34 (Amendment, Pages 9-10). In regards to such arguments, see the 
response directed towards claim 34. 

Claim Rejections - 35 USC §103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

6. Claims 34-35, 37-38, and 40-43 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Ladd et al (U.S. Patent: 6,269,336) in view of Malsheen et al (U.S. Patent: 
5,634,084). 

With respect to Claim 34, Ladd discloses: 

A database referencing a plurality of audio segments, each audio segment of the plurality 
associated with an identifier that uniquely identifies that audio segment (TTS audio file database, 
each audio file having a unique identifier, Col. 10, Line 58 - Col. 11, Line 11; Col. 18, Lines 
33-44, and Col. 29, Lines 36-57); 

A prompt mapping configuration comprising a plurality of prompt classes, text strings, 
and a one-to-one association between each text string and a corresponding audio segment 
identifier (mapping of prompts for various classes and text strings, wherein there is a one-to-one 
association between the audio prompt files and the text strings, Col. 18, Lines 33-44; Col. 29, 
Lines 36-57); 

A prompt audio object is configured to use the contextual information from the voice 
browser to determine a prompt class to match a text string from the document received by the 
voice browser to an audio file (browser context or state is utilized in determining which prompt, 
corresponding to a text string, is to be played, Col. 10, Lines 13-21; Col. 16, Lines 41-57; Col. 
18, Lines 12-32; Col. 18, Lines 33-44; and Col. 29, Lines 36-57), wherein the match, through the 
association of text string occurrences to audio segment identifiers results in identification of an 
audio segment identifier associated with the text string occurrence, and to cause rendering of an 
audio segment, referenced in the database, that is identified by the audio segment identifier 
(generating specific audio prompts based on XML mapping and user voice browser inputs, Col. 
10, Line 58- Col. 11, Line 11; Col. 17, Line 61- Col. 18, Line 44; Col. 37, Line 8- Col. 40, Line 
24; Col. 29, Lines 36-57). 

Ladd additionally teaches method implementation using a computer processor (Col. 
6, Line 65 - Col. 7, Line 17) that would inherently require some type of instruction memory to 
enable instruction storage. 

Although Ladd teaches a voice browser system that is capable of generating an audio 
prompt based on a voice browser user input context for a plurality of the contexts (Col. 2, Lines 
48-58; and Col. 18, Lines 56-65) and utilizes a prompt mapping configuration, Ladd does not 
explicitly teach a prompt mapping configuration having a plurality of occurrences of the same 
text strings, wherein each of the occurrences of each text string is associated with a prompt 
class and corresponding audio segment identifier (i.e., one-to-one association), which is different 
from the other occurrences of that text string, and a matching process to identify an audio 
segment identifier matching the string occurrence within a prompt class. Malsheen, however, 
discloses such a mapping configuration. First, Malsheen discloses a speech output abbreviations 
translation table (Fig. 1, Element 146). This table features a plurality of speech prompt classes 
(type classification, Col. 7, Lines 4-16; Abstract; and "Qual" in Table 1). This table maps a 
single instance of a text string to multiple possible occurrences/expansions in each of the 
different classes (see examples in Col. 9, Lines 30-60; Col. 10, Lines 25-62; and Table 1). Each 
possible expansion occurrence in turn maps to a particular audio signal to be generated at a 
text-to-speech converter (Col. 4, Lines 6-16; and Col. 12, Lines 30-39). 

With respect to the claimed prompt audio object means/step, Malsheen teaches that a 
text in a document is processed to generate a classification based on a neighboring context 
(Abstract; Col. 3, Lines 6-16; Col. 9, Lines 25-60; and Col. 10, Lines 25-62). Malsheen's 
invention also tries to identify a matching expansion occurrence within the classification 
category to further determine a corresponding audio output to be generated via speech synthesis 
(Abstract, Col. 3, Lines 6-16, Col. 4, Lines 6-16; Col. 9, Lines 25-60, and Col. 10, Lines 25-62). 
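
For illustration only, the class-restricted matching discussed above can be sketched as follows. The sketch is not taken from Ladd, Malsheen, or the application; the names AUDIO_DB, PROMPT_MAP, and match_prompt, the class labels, and the audio file names are all hypothetical assumptions. It shows the same text string occurring in two prompt classes, each occurrence tied to a different audio segment identifier, with the search confined to the class that has already been determined.

```python
from typing import Optional

# Hypothetical audio database: audio segment identifier -> audio segment reference.
AUDIO_DB = {
    "aud-001": "dr_doctor.wav",
    "aud-002": "dr_drive.wav",
}

# Hypothetical prompt mapping: prompt class -> {text string -> audio segment identifier}.
# The text string "Dr." occurs once per class, each with a different identifier.
PROMPT_MAP = {
    "names": {"Dr.": "aud-001"},
    "directions": {"Dr.": "aud-002"},
}


def match_prompt(text: str, prompt_class: str) -> Optional[str]:
    """Look up the text string only within the given prompt class.

    Returns the referenced audio segment, or None when the class is unknown
    or the text string has no occurrence in that class.
    """
    audio_id = PROMPT_MAP.get(prompt_class, {}).get(text)
    if audio_id is None:
        return None
    return AUDIO_DB.get(audio_id)
```

Under these assumptions, match_prompt("Dr.", "names") and match_prompt("Dr.", "directions") resolve to different audio segments, because occurrences in other classes are never consulted.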

Ladd and Malsheen are analogous art because they are from a similar field of endeavor in 
speech synthesis. Thus, it would have been obvious to a person of ordinary skill in the art, at the 
time of invention, to modify the teachings of Ladd with the classification-based speech synthesis 
taught by Malsheen in order to provide the proper human pronunciation of words that would not 
be properly spoken by a conventional text-to-speech converter (Malsheen, Col. 2, Lines 53-60). 

Claim 35 recites a method performed by the system recited in claim 34, which is taught 
above by the combination of Ladd and Malsheen, and as such, is rejected under similar rationale. 

With respect to Claim 37, Ladd further discloses: 

The association of audio segment identifiers with the reference text strings is specified in 
a markup language (prompt is associated with an identifier in VoiceHTML, Col. 18, Lines 33-44; 
and Col. 29, Lines 36-57). 

Claim 38 contains subject matter similar in scope to claim 35, and thus, is rejected under 
similar rationale. Also, Ladd discloses method implementation as a program stored on a 
computer readable medium (Col. 6, Line 65- Col. 7, Line 17). 

Claim 40 contains subject matter similar in scope to claim 37, and thus, is rejected under 
similar rationale. 

Claim 41 contains subject matter similar in scope to claim 38, and thus, is rejected under 
similar rationale. Ladd additionally teaches various browser contexts (Col. 2, Lines 48-58; 
Col. 18, Lines 12-65), while Malsheen discloses the multiple prompt classes (Table 1, "Qual"). 

Claim 42 contains subject matter similar in scope to claims 34 and 38, and thus, is 
rejected under similar rationale. Ladd additionally teaches method implementation using a 
computer processor (Col. 6, Line 65- Col. 7, Line 17) that would inherently require some type of 
instruction memory to enable instruction storage. 

With respect to Claim 43, Ladd further discloses a VoiceHTML document (Col. 18, 
Lines 33-44; Col. 29, Lines 36-57; and Col. 12, Lines 25-27). 

7. Claims 36 and 39 are rejected under 35 U.S.C. 103(a) as being unpatentable over Ladd 
et al in view of Malsheen et al and further in view of Saylor et al (U.S. Patent: 6,501,832). 

With respect to Claim 36, Ladd in view of Malsheen discloses the method for 
context-based audio prompts in a voice browser, as applied to Claim 35. Ladd in view of Malsheen does 
not specifically suggest additionally selecting an audio advertisement to render based on 
contextual information; however, Saylor discloses voice advertisement elements indexed to a 
particular pertinent voice page context (Col. 14, Lines 46-62; Col. 18, Lines 46-65; Col. 27, 
Lines 33-56; Col. 36, Line 48 - Col. 37, Line 3; and example of indexed voice ad, Col. 38, 
Line 33 - Col. 39, Line 12). 

Ladd, Malsheen, and Saylor are analogous art because they are from a similar field of 
endeavor in speech synthesis systems. Thus, it would have been obvious to a person of ordinary 
skill in the art, at the time of invention, to modify the teachings of Ladd in view of Malsheen 
with the voice ads taught by Saylor in order to provide a means for revenue generation for voice 
page providers (Saylor, Col. 7, Lines 19-24). 

Claim 39 contains subject matter similar to claim 36, and thus, is rejected under similar 
rationale. 

Conclusion 

8. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action. 




9. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to James S. Wozniak whose telephone number is (571) 272-7632. 
The examiner can normally be reached on M-Th, 7:30-5:00, F, 7:30-4, Off Alternate Fridays. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

/James S. Wozniak/ 

Primary Examiner, Art Unit 2626 



